    Enriching very large ontologies using the WWW

    This paper explores the possibility of exploiting text on the World Wide Web to enrich the concepts in existing ontologies. First, a method for retrieving documents from the WWW related to a concept is described. These document collections are used 1) to construct topic signatures (lists of topically related words) for each concept in WordNet, and 2) to build hierarchical clusters of the concepts (the word senses) that lexicalize a given word. The overall goal is to overcome two shortcomings of WordNet: the lack of topical links among concepts, and the proliferation of senses. Topic signatures are validated on a word sense disambiguation task with good results, which improve when the hierarchical clusters are used. Comment: 6 pages

    EDBL: a General Lexical Basis for the Automatic Processing of Basque

    EDBL (Euskararen Datu-Base Lexikala) is a general-purpose lexical database used in Basque text-processing tasks. It is a large repository of lexical knowledge (currently around 80,000 entries) that acts as a basis and support for a number of different NLP tasks, providing lexical information to several language tools: morphological analysis, spell checking and correction, lemmatization and tagging, syntactic analysis, and so on. It has been designed to be neutral with respect to the different linguistic formalisms, and flexible and open enough to accept new types of information. A browser-based user interface makes it easy for the lexicographer to consult the database, correct and update entries, add new ones, and so on. The paper presents the conceptual schema and the main features of the database, along with some problems encountered in its design and implementation in a commercial DBMS. Given the diversity of the lexical entities and the complex relationships among them, three total specializations have been defined under the main class of the hierarchy that represents the conceptual schema. The first divides all the entries in EDBL into standard and non-standard Basque entries. The second divides the units in the database into dictionary entries (classified by part of speech) and other entries (mainly non-independent morphemes and irregularly inflected forms). Finally, a third total specialization has been established between single-word entries and multiword lexical units; this makes it possible to describe the morphotactics of single-word entries, and the constitution and surface realization schemas of multiword lexical units. A hierarchy of typed feature structures (FS) has been designed to map the entities and relationships in the database conceptual schema. The FSs are coded in TEI-conformant SGML, and Feature Structure Declarations (FSD) have been made for all the types of the hierarchy. Feature structures are used as a delivery format to export the lexical information from the database. The information coded in this way is subsequently used as input by the different language analysis tools.

    Informatikaren oinarriak. C lengoaian ebatzitako problemak (Fundamentals of Computer Science. Solved Problems in C)

    I. Short programming exercises: - Problem statements. - Proposed solutions. II. Long exercises: - Electronic devices. - Consumption habits: drinks. - Radio programme. - Petrol station. - Shop. - Power plant. - The thief Elca Comayor. - Mountain ascents. - Motorway. - Games arcade. - Street lamps. - Distances between towns. - Olympics. - Ventas S.A. - Fatal accidents. - Euskadi Irratia. - San Sebastiana

    Fundamentos de informática. Ejercicios resueltos de programación en C (Fundamentals of Computer Science. Solved Programming Exercises in C)

    I. Short programming exercises: - Problem statements. - Proposed solutions. II. Long exercises: - Electronic components. - Consumption habits: drinks. - Radio station. - Petrol station. - Shop. - Power plant. - The thief Elca Comayor. - Mountain summits. - Motorway. - Games room. - Street lamps. - Towns of Guipúzcoa. - Olympics. - Ventas S.A. - Traffic accidents. - Euskadi Irratia. - San Sebastián